ETC5521-Exploratory data analysis, Semester 2 (2021): Assignment 1

CEO departure in S&P (Standard & Poor’s) 1500 firms: An Analysis

This report is ETC5521 Assignment 1, prepared by a team comprising Qin Xu, Raunak Bhivpathaki, Pian Li, and Nishtha Arora.

1 Introduction and motivation

Chief executive officers (CEOs) and their management teams play a vital role in a company’s performance, as they make the strategic decisions that steer the company toward success. However, since the early 1980s, a series of studies has argued that the “CEO effect” has been largely overestimated [12].

To understand CEO succession more deeply, we analyse CEO dismissals, the reasons for departure, the year of departure, and more. The data set used is an open-source data set that records the reasons for CEO departures in S&P 1500 firms from 1980 to 2021.

In order to carry out the analysis, we will answer the following research questions:

  • What is the most common reason for CEO departure from 1987 to 2020?

  • Compare and contrast involuntary and voluntary CEO departures.

  • How many CEOs with higher experience left companies from 1987 to 2020? Guess which company is more attractive.

  • Give details of the CEOs for the company which has had the maximum number of CEOs over the years.

  • Has the number of CEO departures increased over the years?

2 Data description

2.1 Source of the data

The data set was obtained from the TidyTuesday (R for Data Science) GitHub repository, which in turn took it from the article by Gentry et al. available at Wiley Online Library. The original data source for the article is zenodo.org, an open repository that hosts research data.

2.2 Structure of the data

The data set contains details of each CEO: their name, company name, year of departure, tenure information, and reason for departure. It has 9423 rows and 19 columns. The variable names and their descriptions are listed in Table 2.1:

2.2.1 Variable description and datatypes

Table 2.1: Data Dictionary
VARIABLE | CLASS/TYPE | DESCRIPTION
dismissal_dataset_id | double | The primary key. This will change from one version to the next. gvkey-year is also a unique identifier
coname | character | The Compustat Company Name
gvkey | double | The Compustat Company identifier
fyear | double | The fiscal year in which the event occurred
co_per_rol | double | The executive/company identifier from Execucomp
exec_fullname | character | The executive full name as listed in Execucomp
departure_code | double | The departure reason coded from criteria above
ceo_dismissal | double | A dummy code for involuntary, non-health related turnover (Codes 3 & 4).
interim_coceo | character | A descriptor of whether the CEO was listed as co-CEO or as an interim CEO (sometimes interim positions last a couple years)
tenure_no_ceodb | double | For CEOs who return, this value should capture whether this is the first or second time in office
max_tenure_ceodb | double | For this CEO, how many times did s/he serve as CEO
fyear_gone | double | An attempt to determine the fiscal year of the CEO’s effective departure date. Occasionally, looking at departures on Execucomp does not agree with the leftofc date that we have. They apparently try to balance between the CEO serving one month in the fiscal year against documenting who was CEO on the date of record. I would stick to the Execucomp’s fiscal year, departure indication for consistency with prior work
leftofc | double | Left office of CEO, modified occasionally from execucomp but same interpretation. The date of effective departure from the office of CEO
still_there | character | A date that indicates the last time we checked to see if the CEO was in office. If no date, then it looks like the CEO is still in office but we are in the process of checking
notes | character | Long-form description and justification for the coding scheme assignment.
sources | character | URL(s) of relevant sources from internet or library sources.
eight_ks | character | URL(s) of 8k filing from the Securities and Exchange Commission from 270 days before through 270 days after the CEO’s leftofc date which might relate to the turnover. Included here are any 8k filing 5.02 (departure of directors or principal executives) or simply item 5 if it is an older filing. These were collected without examining their content.
cik | double | The company’s Central Index Key
_merge | character | Merge details

The departure code is stored as a double (numeric) variable, so each departure reason is represented by a single-digit code. The meaning of each code is given in Table 2.2:

Table 2.2: CEO Departure Code Description
CODE NUMBER | TYPE | DESCRIPTION
1 | Involuntary - CEO death | The CEO died while in office and did not have an opportunity to resign before health failed.
2 | Involuntary - CEO illness | Required announcement that the CEO was leaving for health concerns rather than removed during a health crisis.
3 | Involuntary - CEO dismissed for job performance | The CEO stepped down for reasons related to job performance. This included situations where the CEO was immediately terminated as well as when the CEO was given some transition period, but the media coverage was negative. Often the media cited financial performance or some other failing of CEO job performance (e.g., leadership deficiencies, innovation weaknesses, etc.).
4 | Involuntary - CEO dismissed for legal violations or concerns | The CEO was terminated for behavioral or policy-related problems. The CEO’s departure was almost always immediate, and the announcement cited an instance where the CEO violated company HR policy, expense account cheating, etc.
5 | Voluntary - CEO retired | Voluntary retirement based on how the turnover was reported in the media. Here the departure did not sound forced, and the CEO often had a voice or comment in the succession announcement. Media coverage of voluntary turnover was more valedictory than critical. Firms use different mandatory retirement ages, so we could not use 65 or older and facing mandatory retirement as a cut off. We examined coverage around the event and subsequent coverage of the CEO’s career when it sounded unclear.
6 | Voluntary - new opportunity (new career driven succession) | The CEO left to pursue a new venture or to work at another company. This frequently occurred in startup firms and for founders.
7 | Other | Interim CEOs, CEO departure following a merger or acquisition, company ceased to exist, company changed key identifiers so it is not an actual turnover, and CEO may or may not have taken over the new company.
8 | Missing | Despite attempts to collect information, there was not sufficient data to assign a code to the turnover event. These will remain the subject of further investigation and expansion.
9 | Execucomp error | If a researcher were to create a dataset of all potential turnovers using execucomp (co_per_rol != l.co_per_rol), several instances will appear of what looks like a turnover when there was no actual event. This code captures those.
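
The grouping implied by Table 2.2 can be sketched in code. The snippet below is a minimal illustration in Python (the report itself was produced in R). The function names departure_group and is_dismissal are hypothetical, and the split into involuntary (codes 1 to 4), voluntary (codes 5 and 6) and other (codes 7 to 9) simply follows the TYPE column of the table.

```python
# Labels follow the TYPE column of Table 2.2; the names below are illustrative.
INVOLUNTARY_CODES = {1, 2, 3, 4}
VOLUNTARY_CODES = {5, 6}
DISMISSAL_CODES = {3, 4}  # ceo_dismissal dummy: involuntary, non-health related


def departure_group(code):
    """Map a departure_code to a coarse group; None stands for a missing code."""
    if code is None:
        return "missing"
    if code in INVOLUNTARY_CODES:
        return "involuntary"
    if code in VOLUNTARY_CODES:
        return "voluntary"
    return "other"  # codes 7, 8 and 9


def is_dismissal(code):
    """Return 1 for codes 3 and 4, else 0 (mirrors the ceo_dismissal description)."""
    return int(code in DISMISSAL_CODES)


for code in (1, 3, 5, 7, None):
    print(code, departure_group(code), is_dismissal(code))
```

Whether the analysis later in the report used exactly this grouping is not stated, so the mapping should be read as one reasonable interpretation of the code book.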

3 Data Exploration

3.1 Finding Outliers and Duplicates

Checking the original data for duplicate rows in R showed that there are no duplicate entries.
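
A duplicate check of this kind can be sketched as follows. This is an illustrative Python version, not the R code used for the report, and the three rows are made-up toy values rather than rows from the data set; the helper name duplicate_rows is hypothetical.

```python
from collections import Counter

# Hypothetical toy rows (NOT the real data): (coname, fyear, exec_fullname)
rows = [
    ("ALPHA CORP", 1999, "A. Example"),
    ("BETA INC", 2005, "B. Example"),
    ("ALPHA CORP", 1999, "A. Example"),  # exact repeat of the first row
]


def duplicate_rows(records):
    """Return each row that appears more than once, with its count."""
    counts = Counter(records)
    return {row: n for row, n in counts.items() if n > 1}


dupes = duplicate_rows(rows)
print(len(dupes))  # number of distinct rows that are repeated
```

An empty result corresponds to the "no duplicate entries" finding reported above.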

Next, we look for outliers in the time variables of the data set.

Figure 3.1: Finding outliers in fyear_gone

Figure 3.1 shows an outlier in the fyear_gone column: the year ‘2997’, which is impossible since the present year is 2021. This value needs to be removed.

Figure 3.2: Finding outliers in fyear

Although the distribution in Figure 3.2 is not symmetrical, no outliers are detected. The fyear variable ranges from 1987 to 2020, which is plausible.

3.2 Variable Description (after correction)

The outlier found in Figure 3.1 is corrected by removing the row with the year ‘2997’, giving a cleaned data set of 9422 rows (see Table 3.1).
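
The correction amounts to a plausibility filter on the departure year. The sketch below is an illustrative Python version (the report used R); the list of years is a made-up toy example, and the choice to keep missing values at this stage is an assumption.

```python
CURRENT_YEAR = 2021  # the report's "present year"

# Hypothetical toy values for fyear_gone; None stands for a missing value.
fyear_gone = [1995, 2008, 2997, None, 2021]


def is_plausible(year, latest=CURRENT_YEAR):
    """Keep missing values and any year that is not later than `latest`."""
    return year is None or year <= latest


cleaned = [y for y in fyear_gone if is_plausible(y)]
removed = [y for y in fyear_gone if not is_plausible(y)]

print(cleaned)
print(removed)
```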

3.2.1 Variable datatypes visualization

All the variables have appropriate data types, which are displayed in Figure 3.3:

Figure 3.3: Visualization for variable types

Most of the variables are numeric. Most of the missing values occur in the character variables. The exact percentage of missing values for each variable is shown in Figure 3.4:

Figure 3.4: Missing values in Variables

These missing values will be removed for the analysis as and when required.

3.3 Checking the data quality

The quality of the data set is checked using the skim function from the skimr package in R [3]. The summary below shows the number of variables of each data type and, for each variable, the number of missing and unique values, the minimum and maximum, and the mean and standard deviation (sd).

Table 3.1: Data summary
Name clean
Number of rows 9422
Number of columns 19
_______________________
Column type frequency:
character 8
numeric 10
POSIXct 1
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
coname 0 1.00 2 30 0 3859 0
exec_fullname 0 1.00 5 790 0 8700 0
interim_coceo 9104 0.03 6 7 0 6 0
still_there 7311 0.22 3 10 0 77 0
notes 1644 0.83 5 3117 0 7754 0
sources 1475 0.84 18 1843 0 7914 0
eight_ks 4498 0.52 69 3884 0 4914 0
_merge 0 1.00 11 11 0 1 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
dismissal_dataset_id 0 1.00 5684.55 25006.75 1 2306.25 4593.5 6812.75 559044 ▇▁▁▁▁
gvkey 0 1.00 40136.22 53922.97 1004 7343.00 14385.0 60900.75 328795 ▇▁▁▁▁
fyear 0 1.00 2007.74 8.19 1987 2000.00 2008.0 2016.00 2020 ▁▆▅▅▇
co_per_rol 0 1.00 25582.30 18202.22 -1 8559.50 22982.0 39275.75 64602 ▇▆▅▃▃
departure_code 1667 0.82 5.20 1.53 1 5.00 5.0 7.00 9 ▁▃▇▅▁
ceo_dismissal 1813 0.81 0.20 0.40 0 0.00 0.0 0.00 1 ▇▁▁▁▂
tenure_no_ceodb 0 1.00 1.03 0.17 0 1.00 1.0 1.00 3 ▁▇▁▁▁
max_tenure_ceodb 0 1.00 1.05 0.24 1 1.00 1.0 1.00 4 ▇▁▁▁▁
fyear_gone 1802 0.81 2006.51 7.55 1980 2000.00 2007.0 2013.00 2021 ▁▂▇▇▇
cik 245 0.97 741545.85 486522.47 1750 106413.00 857323.0 1050377.00 1808065 ▆▁▇▂▁

Variable type: POSIXct

skim_variable n_missing complete_rate min max median n_unique
leftofc 1802 0.81 1981-01-01 2021-12-01 2006-12-31 3626
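
The complete_rate column in Table 3.1 is the share of non-missing values, that is, one minus n_missing divided by the number of rows. The short Python sketch below (illustrative only, not the report's R code) recomputes it for a few variables, using the n_missing values and the row count copied from the table.

```python
N_ROWS = 9422  # rows in the cleaned data set (Table 3.1)

# n_missing values copied from Table 3.1
n_missing = {
    "interim_coceo": 9104,
    "still_there": 7311,
    "notes": 1644,
    "eight_ks": 4498,
    "departure_code": 1667,
}


def complete_rate(missing, total=N_ROWS):
    """Share of non-missing values, rounded to two decimals as in the table."""
    return round(1 - missing / total, 2)


rates = {name: complete_rate(m) for name, m in n_missing.items()}
for name, rate in rates.items():
    print(name, rate)
```

The recomputed values agree with the table: for example, interim_coceo is missing in 9104 of 9422 rows, which gives a complete rate of 0.03.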

Figure 3.5 below displays a histogram for each variable.

Figure 3.5: Variable values distribution

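The histograms show the range and skewness of the values of each variable. For example, the histogram of gvkey is right-skewed: most values are small, with a long tail of large identifiers (in Table 3.1 its mean, 40136.22, is well above its median, 14385.0).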

3.4 Time frame of collection

Figure 3.6 shows the time frame for the year of CEO departure (fyear_gone).

Figure 3.6: Time frame for CEO departure event

We see that the data set has a single data value at 1980 and numerous values from 1988 to 2021.

Figure 3.7 shows the time frame for the fiscal year in which the event took place (fyear).

Figure 3.7: Time frame for event occurrence (CEO dismissal)

The fyear values run from 1987 to 2020, with a single record in each of 1987 and 2020 and numerous records for the years 1992 to 2019.

Overall, therefore, the data covers the period from 1980 to 2021.

4 Data Collection Method

  • The data was collected by Richard J. Gentry, Joseph S. Harrison, Steven Boivie and Timothy J, as part of their research on CEO turnover and dismissal in S&P 1500 firms [5].

  • The data was gathered from a range of news coverage and SEC filings (financial statements submitted to the Securities and Exchange Commission), found on the web and in university libraries.

  • The data was then coded by paid undergraduate students in a computer lab under the direct supervision of two strategy PhD students at a major university in the southeastern United States.

  • During the COVID-19 pandemic, data coding was carried out through a data collection company outside the United States, as in-person coding was not possible due to restrictions in the United States.

5 Analysis

(From this point on, the figures in the interactive version of this report are Plotly plots; click on a region, plot or shape to see more information.)

5.2 Compare and contrast involuntary and voluntary CEO departures

This section compares the involuntary and voluntary CEO departures between 1987 and 2020 for S&P 1500 companies.

Table 5.2: The average annual CEO turnover from 1987 to 2020
Type | Total | Average
voluntary | 3781 | 118.16
involuntary | 1696 | 60.57

  • The table shows that involuntary causes account for 1696 CEO departures in S&P 1500 companies from 1987 to 2020, while voluntary reasons account for considerably more, 3781.

  • In addition, the average number of voluntary CEO departures is about 118 per year, roughly double the average for involuntary departures (about 61 per year).
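
As a quick arithmetic check on Table 5.2, dividing each total by its average recovers the number of years the average seems to be taken over. The sketch below is in Python (not the report's R code) and assumes, without confirmation from the source, that each average is the total divided by a count of years.

```python
# Totals and yearly averages copied from Table 5.2.
table_5_2 = {
    "voluntary": {"total": 3781, "average": 118.16},
    "involuntary": {"total": 1696, "average": 60.57},
}

# Number of years each average appears to be taken over (total / average).
implied_years = {
    kind: round(row["total"] / row["average"])
    for kind, row in table_5_2.items()
}

# Ratio of the two yearly averages.
ratio = table_5_2["voluntary"]["average"] / table_5_2["involuntary"]["average"]

print(implied_years)
print(round(ratio, 2))
```

The division suggests 32 years for voluntary and 28 years for involuntary departures, so the two averages may have been computed over different numbers of years, which is worth keeping in mind when comparing them.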

Figure 5.3 displays the trend of voluntary and involuntary CEO departures from 1987 to 2020.

Figure 5.3: The number of involuntary and voluntary CEO departures from 1987 to 2020

  • Voluntary departures clearly outnumber involuntary departures. The first voluntary departure is recorded in 1987, when only one such departure was reported, whereas the first involuntary departures are recorded in 1992, when 11 instances were reported.

Voluntary departures are instances when the CEO steps down of his or her own will, for example to retire or to pursue a new opportunity (codes 5 and 6 in Table 2.2). Disagreements over company decisions and policies, or other personal reasons, may also lie behind such a decision. Note that departures for illness are coded as involuntary (code 2) in this data set. Overall, voluntary departures do not necessarily stem from any misconduct or mistake on the CEO’s part.

Involuntary departures are instances when the CEO is dismissed by the company or leaves because of death or illness (codes 1 to 4 in Table 2.2). A CEO may be dismissed for a variety of reasons, such as misconduct or harassment at the workplace, poor performance, working against the company’s interests, or violating company policies. Involuntary departures are generally regarded as unfavourable for the CEO.

Figure 5.3 shows that the highest number of voluntary departures was reported in 1999, when a total of 185 CEOs stepped down. The highest number of involuntary departures was reported in 2000, when 88 CEOs left their role.

However, the year with the most CEO departures overall is 2007, when 265 CEOs stepped down. This may be because 2007 marked the start of the Global Financial Crisis, which hit corporations badly.

The graph also shows that the number of CEO departures declines towards the end of the period, particularly sharply after 2018.

5.3 How many CEOs with higher experience left companies from 1987 to 2020? Guess which company is more attractive.

A person who has held the same position more than once is likely to have more experience. This section looks at the number of experienced CEOs, defined here as those who served as CEO two or more times, from 1987 to 2020.

Table 5.3: Total number of experienced CEO for each company
Company name | Number of experienced CEOs
CONVERSANT INC | 5
STEWART INFORMATION SERVICES | 5
AGILYSYS INC | 4
AMERICAN PACIFIC CORP | 4
AWARE INC | 4
CHIPOTLE MEXICAN GRILL INC | 4
CORELOGIC INC | 4
HANCOCK WHITNEY CORP | 4
HARMAN INTERNATIONAL INDS | 4
ORACLE CORP | 4

The table above shows, for each company, the total number of departing CEOs who served as CEO two or more times. CONVERSANT INC and STEWART INFORMATION SERVICES have the highest totals, with five such CEOs each.
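
A count of this kind can be sketched as below. This is an illustrative Python version with made-up toy rows (not the real data, and not the report's R code); the helper experienced_counts and its distinct_people switch are hypothetical. It is included because counting departure records and counting distinct people can give different totals when a returning CEO appears in more than one row.

```python
from collections import Counter

# Hypothetical toy rows (NOT the real data):
# (coname, exec_fullname, max_tenure_ceodb)
records = [
    ("ALPHA CORP", "A. Example", 2),
    ("ALPHA CORP", "A. Example", 2),  # same person, second departure record
    ("ALPHA CORP", "C. Example", 1),
    ("BETA INC", "B. Example", 3),
]


def experienced_counts(rows, min_times=2, distinct_people=False):
    """Count rows (or distinct executives) per company with max_tenure >= min_times."""
    kept = [(co, name) for co, name, times in rows if times >= min_times]
    if distinct_people:
        kept = set(kept)
    return Counter(co for co, _ in kept)


print(experienced_counts(records))
print(experienced_counts(records, distinct_people=True))
```

In the toy data ALPHA CORP has two qualifying records but only one qualifying person; which of the two counts Table 5.3 reports is not stated in the source.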

Figure 5.4 below shows the number of CEO departures from different corporations, broken down by how many times each CEO served. We compare the eight companies that have the highest numbers of experienced CEOs.

Figure 5.4: Experienced CEO’s vs Company

From Table 5.3 and Figure 5.4, the two companies with the highest total number of experienced CEOs, CONVERSANT INC and STEWART INFORMATION SERVICES, each have three people who served as CEO three times and two people who served as CEO twice. Furthermore, PHOTRONICS INC has the most experienced CEOs, with four people who served as CEO four times.

Based on Table 5.3 and Figure 5.4, one guess is that having more experienced CEOs reflects a company’s development: generally, stronger companies can attract more experienced people. On this reasoning, PHOTRONICS INC appears to be the most attractive to highly experienced CEOs. Additionally, firms should improve the conditions offered for such important positions in order to retain talent.

5.4 Give details of the CEOs for the company which has had the maximum number of CEOs over the years.

Finding the company with the most CEOs

Figure 5.5: Top 7 companies with the most CEOs

From Figure 5.5, we see that SEARS HOLDINGS CORP has had the most CEOs over the years, with 11.

Dismissal history of Sears Holdings Corp. CEOs

Table 5.4 lists the 11 CEO records for Sears Holdings Corp. along with the reason for each departure.

Table 5.4: CEO Dismissal history of the Sears Holdings Corp.
fyear | fyear_gone | exec_fullname | departure_code | tenure_no_ceodb | Reason_of_Dismissal
1994 | 1995 | Joseph E. Antonini | 3 | 1 | Job performance
1999 | 2000 | Floyd Hall | 5 | 1 | CEO retired
2001 | 2002 | Charles C. Conaway | 4 | 1 | Legal violations
2002 | 2002 | James B. Adamson | 8 | 1 | Unknown reason
2003 | 2004 | Julian C. Day | 7 | 1 | Due to merger, acquisition or other reasons
2004 | NA | Aylwin B. Lewis | 9 | 1 | Execucomp error
2005 | 2005 | Alan J. Lacy | 9 | 1 | Execucomp error
2007 | 2008 | Aylwin B. Lewis | 3 | 2 | Job performance
2010 | 2011 | William Bruce Johnson | 3 | 1 | Job performance
2012 | 2013 | Louis Joseph D’Ambrosio | 7 | 1 | Due to merger, acquisition or other reasons
2017 | 2018 | Edward Scott Lampert | 3 | 1 | Job performance

In Table 5.4, the fyear_gone (year of CEO departure) for ‘Aylwin B. Lewis’ is shown as ‘NA’ because the value is missing. This row cannot be omitted, as its other variables are needed for the analysis. From the table, the missing value is likely to be 2004 or 2005, as the next CEO, Alan J. Lacy, is recorded in 2005.

Observations from Table 5.4:

  • Job performance is the most common reason for departure (4 of the 11 records). One of these four CEOs, Aylwin B. Lewis, was serving for the second time (tenure_no_ceodb = 2).

  • This suggests poor background research on candidates and poor judgment by the recruitment team.
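
The counts behind these observations can be reproduced directly from Table 5.4. The Python sketch below is illustrative only (the report used R); the names, departure codes and tenure numbers are copied from the table, and the variable names are hypothetical.

```python
from collections import Counter

# (exec_fullname, departure_code, tenure_no_ceodb), copied from Table 5.4.
sears = [
    ("Joseph E. Antonini", 3, 1),
    ("Floyd Hall", 5, 1),
    ("Charles C. Conaway", 4, 1),
    ("James B. Adamson", 8, 1),
    ("Julian C. Day", 7, 1),
    ("Aylwin B. Lewis", 9, 1),
    ("Alan J. Lacy", 9, 1),
    ("Aylwin B. Lewis", 3, 2),
    ("William Bruce Johnson", 3, 1),
    ("Louis Joseph D'Ambrosio", 7, 1),
    ("Edward Scott Lampert", 3, 1),
]

# How often each departure code occurs.
code_counts = Counter(code for _, code, _ in sears)

# Distinct executives (one name appears in two rows).
distinct_executives = {name for name, _, _ in sears}

# Job-performance departures (code 3) of CEOs on a second or later term.
returning_code_3 = [
    name for name, code, tenure in sears if code == 3 and tenure >= 2
]

print(code_counts.most_common())
print(len(sears), len(distinct_executives))
print(returning_code_3)
```

The tally gives four records with code 3, two each with codes 7 and 9, and one each with codes 4, 5 and 8. The 11 records cover 10 distinct executives, because Aylwin B. Lewis appears twice.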

5.5 Has the number of CEO departures increased over the years?

The plot below shows the correlation between the year of departure and the number of departures in that year.

Figure 5.6: Year with maximum CEO departures

Observations from Figure 5.6:

  • The correlation coefficient (R) is 0.45, which indicates a moderate (on the weak side) positive linear relationship.

  • This means that the number of CEO departures has tended to increase over the years, but slowly rather than rapidly.
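
For reference, the correlation coefficient is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it in plain Python on made-up toy counts; it is not the report's R code and does not reproduce the 0.45 reported above, and the function name pearson_r is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy counts per year (NOT the report's data).
years = [2000, 2001, 2002, 2003, 2004]
departures = [10, 12, 9, 15, 14]

r = pearson_r(years, departures)
print(round(r, 2))
```

A value near 1 means the yearly counts rise almost linearly with the year, a value near 0 means no linear trend, and the sketch does not handle the degenerate case where one of the series is constant.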

6 Limitations

  • Limitation of the data set: the sample size is small, which may have made the results less accurate and affected the final findings of the analysis.

  • CEO departures were coded manually. This may have introduced coding errors, leading to incorrect conclusions.

7 Conclusion

  • From 1987 to 2020, the top 3 reasons for CEO departure, namely job performance, CEO retirement and “other”, account for 92% of the total. Departures due to retirement peaked in 1998 and those due to job performance peaked in 2000.

  • There are more voluntary than involuntary departures, which means that a larger number of CEOs stepped down of their own will, by retiring, moving to another company, or because of disagreements with the company and its policies.

  • PHOTRONICS INC has the most experienced departing CEOs, who served as CEO four times. CONVERSANT INC and STEWART INFORMATION SERVICES have the highest total number of experienced departing CEOs, who served as CEO two or three times.

  • The largest number of CEO changeovers occurred at SEARS HOLDINGS CORP: a total of 11 CEO departures, with job performance the most common reason (4 of 11).

  • Over the years, the number of CEO departures has increased slowly (R = 0.45). The year with the most departures overall was 2007, with 265 (Section 5.2).

8 References

[1] Alboukadel Kassambara (2020). ggpubr: ‘ggplot2’ Based Publication Ready Plots. R package version 0.4.0. https://CRAN.R-project.org/package=ggpubr

[2] C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC Florida, 2020.

[3] Elin Waring, Michael Quinn, Amelia McNamara, Eduardo Arino de la Rubia, Hao Zhu and Shannon Ellis (2021). skimr: Compact and Flexible Summaries of Data. R package version 2.1.3. https://CRAN.R-project.org/package=skimr

[4] Garrett Grolemund, Hadley Wickham (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25. URL https://www.jstatsoft.org/v40/i03/.

[5] Gentry et al. (2021). A database of CEO turnover and dismissal in S&P 1500 firms, 2000–2018. Wiley Online Library. https://onlinelibrary.wiley.com/doi/abs/10.1002/smj.3278

[6] Hadley Wickham and Jim Hester (2020). readr: Read Rectangular Text Data. R package version 1.4.0. https://CRAN.R-project.org/package=readr

[7] Hadley Wickham and Jennifer Bryan (2019). readxl: Read Excel Files. R package version 1.3.1. https://CRAN.R-project.org/package=readxl

[8] Hao Zhu (2021). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.3.4. https://CRAN.R-project.org/package=kableExtra

[9] Julien Barnier (2021). rmdformats: HTML Output Formats and Templates for ‘rmarkdown’ Documents. R package version 1.0.2. https://CRAN.R-project.org/package=rmdformats

[10] Silge, J. (2019, July 1). Reordering and facetting for ggplot2. Julia Silge. https://juliasilge.com/blog/reorder-within/

[11] Singer-Vine, J. (2021). Data Is Plural — 2021.04.21 edition. Data-is-plural.com. https://www.data-is-plural.com/archive/2021-04-21-edition/.

[12] tidytuesday/readme.md at master · rfordatascience/tidytuesday. (n.d.). GitHub. Retrieved August 27, 2021, from https://github.com/rfordatascience/tidytuesday/blob/master/data/2021/2021-04-27/readme.md

[13] Tierney N (2017). “visdat: Visualising Whole Data Frames.” JOSS, 2(16), 355. https://doi.org/10.21105/joss.00355

[14] Wickham et al., (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686

[15] Yihui Xie (2016). bookdown: Authoring Books and Technical Documents with R Markdown. Chapman and Hall/CRC. ISBN 978-1138700109